Development of a prediction model for suicidal ideation in patients with advanced cancer: A multicenter, real‐world, pan‐cancer study in China

Abstract
Background: Patients diagnosed with advanced-stage cancer face an elevated risk of suicide. We aimed to develop a suicidal ideation (SI) risk prediction model for patients with advanced cancer to provide early warning of SI and facilitate suicide prevention in this population.
Patients and Methods: We consecutively enrolled patients with multiple types of advanced cancer from 10 cancer institutes in China from August 2019 to December 2020. Demographic characteristics, clinicopathological data, and clinical treatment history were extracted from medical records. Symptom burden, psychological status, and SI were assessed using the MD Anderson Symptom Inventory (MDASI), Hospital Anxiety and Depression Scale (HADS), and Patient Health Questionnaire-9 (PHQ-9), respectively. A multivariable logistic regression model was employed to establish the model structure.
Results: In total, 2814 participants were included in the final analysis. Nine predictors, including age, sex, number of household members, history of previous chemotherapy, history of previous surgery, MDASI score, HADS-A score, HADS-D score, and life satisfaction, were retained in the final SI prediction model. The model achieved an area under the curve (AUC) of 0.85 (95% confidence interval: 0.82–0.87), with AUCs ranging from 0.75 to 0.95 across 10 hospitals and higher than 0.83 for all cancer types.
Conclusion: This study built an easy-to-use, well-performing predictive model for SI. Implementation of this model could facilitate the incorporation of psychosocial support for suicide prevention into the standard care of patients with advanced cancer.


| INTRODUCTION
Cancer represents a significant global health burden, with both its incidence and mortality rates rising sharply in recent years. 1,2 Furthermore, research has revealed that cancer patients face nearly double the risk of suicide compared to the general population. 3 An analysis utilizing data from the Surveillance, Epidemiology, and End Results (SEER) program found that between 1973 and 2014, 13,311 out of 8,651,569 cancer patients died by suicide, yielding a suicide rate of 28.58 per 100,000 person-years and a standardized mortality ratio of 4.44 (95% confidence interval, 4.33-4.55). 4 Notably, patients with metastatic cancer had an even higher suicide rate. 5 Suicide is a tragic event that is often preventable if appropriate measures are taken to address suicidal ideation (SI) and behavior. 6 In accordance with the ideation-to-action model of suicide, SI is characterized by thoughts, considerations, or plans regarding suicide, and thus shares common risk with suicidal behavior. 7,8 Given that suicidal behavior can be sudden and urgent, it is imperative to prioritize the identification of SI in the prevention of suicide among cancer patients.
A meta-analysis revealed that the prevalence of SI among cancer patients in Mainland China has reached nearly 25%. 9 The causes and risk factors for SI are complex and heterogeneous. Prior research has predominantly concentrated on identifying variables associated with a heightened risk of SI, an approach known as the traditional "risk-factor-identification strategy." However, an integrated model that incorporates multifaceted predictors to facilitate early screening and identification of patients at risk of SI is lacking. 10,11 Consequently, it remains challenging for oncology clinicians to identify cancer patients with SI and provide targeted psychological intervention and treatment to prevent suicide.
We therefore conducted a cross-sectional real-world study across 10 top-tier clinical oncology institutions in China, aiming to develop a pragmatic model to screen for the risk of SI among patients with advanced cancer and to provide a useful tool for early detection of and intervention for SI in this population.

| Study subjects
Between August 2019 and December 2020, we consecutively enrolled patients aged 18 years and older who had been diagnosed with advanced lung, gastric, breast, liver, colorectal, or esophageal cancer at 10 leading cancer institutions situated in 10 representative provinces across China. The provinces encompassed the eastern (4 provinces), central (4 provinces), and western (2 provinces) regions and were chosen with consideration given to both geographical distribution and level of economic development (Figure S1). Advanced cancer here refers to cancer for which no curative treatment options are available. Patients with significant communication difficulties or cognitive impairment, or those deemed too frail to complete the questionnaire, were excluded.
The study was registered as a clinical trial (ChiCTR1900024957) and was approved by the ethics committee of Peking University Cancer Hospital (2019YJZ34). All participants provided informed consent prior to their inclusion in the study.

| Clinical data collection
We developed an online real-time electronic Patient-Reported Outcome (ePRO) system for patient data collection. 12 Patient data were uploaded through this platform and monitored in real time by designated study personnel.
Patients were enrolled upon their initial hospital admission. Following informed consent, all participants completed the Case Report Form, which covered demographic characteristics (age, sex, smoking history, etc.), clinicopathologic data (diagnosis, cancer stage, etc.), and clinical treatment experience (surgery, chemotherapy, radiotherapy, etc.), with assistance from professionals at each center. The clinical data were gathered from electronic medical records.

| Evaluation of psychological status and suicidal ideation
Using the ePRO system, we assessed patients' cancer-related symptoms, SI, anxiety and depressive symptoms, and insomnia with the MD Anderson Symptom Inventory (MDASI), the Patient Health Questionnaire-9 (PHQ-9), the Hospital Anxiety and Depression Scale (HADS), and the Insomnia Severity Index (ISI), respectively. Data on self-reported life satisfaction were also collected.
The PHQ-9 is a commonly applied screening tool for depressive symptoms and SI, rooted in the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV). 13 The Simplified Chinese version of the PHQ-9 has been well validated. 14 This questionnaire comprises 9 items rated on a 4-point Likert scale ranging from 0 to 3.
The HADS is a 14-item questionnaire rated on a 4-point Likert scale (0-3). 15 The Chinese version of the HADS has been validated and proven reliable. 16 Anxiety symptoms are assessed by the odd-numbered items, while depressive symptoms are assessed by the even-numbered items. The sum score for each subscale ranges from 0 to 21, with a higher score indicating more severe anxiety or depressive symptoms.
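The subscale scoring described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `score_hads` is hypothetical, and it assumes the 14 responses are supplied in item order as final item scores (0-3), with any item-level reverse coding already applied.

```python
def score_hads(responses):
    """Return (anxiety, depression) subscale totals from 14 HADS item scores.

    Assumption: `responses` lists the final 0-3 score of items 1..14 in order.
    """
    if len(responses) != 14:
        raise ValueError("HADS has 14 items")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each HADS item is scored 0-3")
    # Items 1, 3, 5, ... (odd-numbered) sit at indices 0, 2, 4, ...
    anxiety = sum(responses[0::2])
    # Items 2, 4, 6, ... (even-numbered) sit at indices 1, 3, 5, ...
    depression = sum(responses[1::2])
    return anxiety, depression


# Example: every anxiety item scored 1 and every depression item scored 2.
print(score_hads([1, 2] * 7))  # (7, 14)
```

Each subscale has seven items, so both totals fall in the 0-21 range stated in the text.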
The MDASI is a questionnaire consisting of 13 items designed for symptom assessment. 17 The Chinese version has been validated and proven reliable. 18 Each item is scored from 0 ("Nothing") to 10 ("Most severity"). Symptoms scored 5 to 6 were defined as "moderate" and those scored 7 to 10 as "severe." We used the MDASI score as a composite measure of symptom burden.
The ISI is a 7-item questionnaire used for measuring the severity of insomnia in the past 2 weeks. 19 The Chinese version of the ISI has been shown to possess good reliability and validity. 20 Each item is scored on a scale from 0 ("not at all") to 4 ("very much"). The total score therefore ranges from 0 to 28, where 0-7 indicates no insomnia and 8 or higher indicates insomnia.
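As an illustration of the ISI categorization just described, the sketch below sums the seven item scores and applies the 0-7 versus 8-and-above split. The function name `score_isi` and the category labels are hypothetical choices made for this example.

```python
def score_isi(responses):
    """Return (total, category) for seven ISI item scores, each 0-4."""
    if len(responses) != 7:
        raise ValueError("ISI has 7 items")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each ISI item is scored 0-4")
    total = sum(responses)  # possible range: 0-28
    # Threshold taken from the text: 0-7 no insomnia, 8 or higher insomnia.
    category = "insomnia" if total >= 8 else "no insomnia"
    return total, category


print(score_isi([1, 1, 1, 1, 1, 1, 1]))  # (7, 'no insomnia')
print(score_isi([2, 1, 1, 1, 1, 1, 1]))  # (8, 'insomnia')
```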

| Data processing
Variables with more than 10% missing values were excluded from the analysis. Missing values in the remaining variables were imputed using multiple imputation by chained equations, which generated five complete datasets. The imputation method depended on the variable type: predictive mean matching was used for numerical variables, logistic regression for binary variables, and polytomous logistic regression for categorical variables. 22 Age was analyzed as a categorical variable, categorized by its percentiles, while the number of household members and Eastern Cooperative Oncology Group Performance Status (ECOG PS) were categorized based on accepted cutoff values. To identify the critical candidate predictors, all collected variables were first evaluated using a univariable logistic regression model, and variables that were not clinically relevant to the outcome were excluded. Finally, a total of 23 potential predictive variables were included for subsequent analysis.

| Predicted outcome
We defined SI based on item 9 of the PHQ-9 scale, "Over the last two weeks how often have you been bothered by thoughts that you would be better off dead or of hurting yourself in some way?" The response options for each item of the PHQ-9 are "not at all" (scoring 0), "several days" (scoring 1), "more than half the days" (scoring 2), or "nearly every day" (scoring 3). Patients who reported such thoughts (i.e., scored 1 or above) were considered to have SI.
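The outcome definition above amounts to dichotomizing the item 9 score at 1. A minimal sketch follows; the names `PHQ9_RESPONSE_SCORES`, `item9_score`, and `has_suicidal_ideation` are hypothetical and exist only for this illustration.

```python
# Response options and scores as listed in the text.
PHQ9_RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


def item9_score(response):
    """Map a PHQ-9 response label to its 0-3 score."""
    return PHQ9_RESPONSE_SCORES[response.strip().lower()]


def has_suicidal_ideation(item9):
    """Binary outcome: True when the PHQ-9 item 9 score is 1 or above."""
    if item9 not in (0, 1, 2, 3):
        raise ValueError("PHQ-9 item scores range from 0 to 3")
    return item9 >= 1


print(has_suicidal_ideation(item9_score("Several days")))  # True
print(has_suicidal_ideation(item9_score("not at all")))    # False
```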

| Variable selection and model construction
A two-step variable selection strategy was used to construct the prediction model. Considering clinical relevance, we classified all potential predictive variables into four groups: basic characteristics (age, sex, marital status, occupation, employment status, number of household members, cigarette smoking, psychological counseling experience, personal history of depressive disorder, and family history of depressive disorder), disease status (ECOG PS, weight loss, and interval between diagnosis and evaluation), history of treatment (history of previous surgery, chemotherapy, and radiotherapy, current treatment, and side effects of treatment), and self-reported psychological status (life satisfaction, MDASI score, HADS anxiety [HADS-A] score, HADS depression [HADS-D] score, and ISI category). First, the variables in each group were included in a multivariable logistic regression model, and backward stepwise selection based on the Akaike Information Criterion (AIC) was used to identify candidate predictors. Second, the selected candidate predictors from these four groups were entered together into a multivariable logistic model. Because age and sex are important risk factors related to both SI and cancer type, they were forced into the logistic model. The predictors retained in the final model were determined through AIC-based backward stepwise elimination and by considering their clinical significance. Each imputed dataset was analyzed, and the pooled coefficients, odds ratios (ORs), and 95% CIs were obtained using Rubin's rules. 23 A nomogram, allowing for visual representation of the individual risk of SI, was constructed based on the final model, with the points assigned to each predictor determined by its regression coefficient.
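To make the pooling step concrete, the sketch below applies Rubin's rules to one regression coefficient estimated in several imputed datasets. It is an illustration under stated assumptions rather than the analysis code used in the study: the function name `pool_rubin` is hypothetical, and the confidence interval uses a normal approximation (z = 1.96) in place of the t-distribution with Rubin's degrees of freedom.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors, z=1.96):
    """Pool one log-odds coefficient across m imputed datasets."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and SEs from >= 2 imputations")
    pooled = mean(estimates)                     # pooled coefficient
    within = mean([se ** 2 for se in std_errors])  # within-imputation variance
    between = variance(estimates)                # between-imputation variance
    total = within + (1 + 1 / m) * between       # total variance
    se = math.sqrt(total)
    return {
        "coef": pooled,
        "se": se,
        "or": math.exp(pooled),
        # Simplification: normal-approximation interval on the log-odds scale.
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
    }


# Hypothetical coefficients from two imputed datasets.
result = pool_rubin([0.4, 0.6], [0.1, 0.1])
print(round(result["coef"], 3), round(result["se"], 3))  # 0.5 0.2
```

In the toy call, the within-imputation variance is 0.01 and the between-imputation variance is 0.02, so the total variance is 0.01 + 1.5 × 0.02 = 0.04 and the pooled standard error is 0.2.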

| Assessment of model performance and validation of the model
We employed the Receiver Operating Characteristic (ROC) curve to assess the final prediction model's ability to discriminate individuals at high risk of SI. The area under the curve (AUC) was calculated from the predicted probabilities and the observed outcomes. Calibration curves were plotted to visually represent the agreement between observed and predicted risk. 24 To evaluate the accuracy and generalizability of the model, internal validations in each imputed subset of different hospitals and tumor types were performed with the R package "psfmi." 25
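For readers unfamiliar with the AUC, it equals the probability that a randomly chosen patient with SI receives a higher predicted risk than a randomly chosen patient without SI, with ties counted as one half. The sketch below computes it directly from that definition. It is a generic pairwise implementation written for illustration (the function name `auc` and the toy data are assumptions), not the routine used in the study.

```python
def auc(outcomes, risks):
    """AUC as the fraction of (case, non-case) pairs ranked correctly."""
    cases = [r for y, r in zip(outcomes, risks) if y == 1]
    controls = [r for y, r in zip(outcomes, risks) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                concordant += 1.0   # case ranked above non-case
            elif c == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


# Toy data: two patients without SI, two with SI.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in sample size, which is acceptable for a demonstration; rank-based formulas give the same value more efficiently on large datasets.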

| Evaluation of model-based tailored screening
To assess the application performance of our model, we assumed a hypothetical tailored screening strategy in which only patients with risk probabilities higher than a specific cutoff value were offered psychological supportive care. For each anticipated level of population coverage, the highest predicted probability that achieved that coverage in the overall patient group was selected as the cutoff value. This value was then used to evaluate the application performance of the model in patients with different cancer types. Sensitivity, the detection probability, and the detection probability ratio (as compared to all samples) under each coverage in different datasets were calculated.
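The tailored screening evaluation can be sketched as follows. Several interpretations here are assumptions made for illustration: "coverage" is read as the fraction of patients referred, "detection probability" as the proportion of referred patients who have SI, and the "detection probability ratio" as that proportion divided by the overall prevalence of SI. The function names and toy data are hypothetical.

```python
import math


def cutoff_for_coverage(risks, coverage):
    """Highest cutoff at which at least `coverage` of patients have risk >= cutoff."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ranked = sorted(risks, reverse=True)
    # Small tolerance guards against floating-point noise in coverage * n.
    k = max(1, math.ceil(coverage * len(ranked) - 1e-9))
    return ranked[k - 1]


def screening_metrics(risks, outcomes, cutoff):
    """Sensitivity, detection probability, and ratio versus universal screening."""
    referred = [y for r, y in zip(risks, outcomes) if r >= cutoff]
    n_cases = sum(outcomes)
    if not referred or n_cases == 0:
        raise ValueError("need at least one referred patient and one case")
    detected = sum(referred)
    prevalence = n_cases / len(outcomes)
    detection_probability = detected / len(referred)
    return {
        "referred_fraction": len(referred) / len(outcomes),
        "sensitivity": detected / n_cases,
        "detection_probability": detection_probability,
        "detection_probability_ratio": detection_probability / prevalence,
    }


# Toy cohort of 10 patients, 3 of whom have SI (outcome 1).
risks = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
cutoff = cutoff_for_coverage(risks, 0.2)  # refer the top 20% by predicted risk
print(cutoff, screening_metrics(risks, outcomes, cutoff))
```

In the toy cohort, referring the top 20% selects two patients, both with SI, so two of the three cases are detected and every referred patient is a true case.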

| Sensitivity analysis
To evaluate the robustness of our main results, we performed a sensitivity analysis in which the discriminatory ability of the final model was assessed in the complete dataset without missing values for any predictors. Additionally, we employed the k-nearest neighbor imputation method to construct and evaluate the model's performance.
All data processing and statistical analysis were performed using Stata 16.0 and R 4.1.2. Statistical significance was set at p < 0.05.

| Patient characteristics
A total of 2814 participants with advanced malignant tumors from 10 hospitals qualified for inclusion in the final analysis (Figure S2). Lung cancer was the most prevalent cancer type, comprising 24.0% of all patients, followed by breast, colorectal, stomach, esophageal, and liver cancer. Overall, 598 (21.3%, 95% CI: 19.8%-22.8%) of these patients were identified as having SI based on item 9 of the PHQ-9. Following a backward stepwise approach, 12 out of 23 variables were selected from the four designated groups, and the distributions of these variables were summarized (Table 1). The median age of the patients in this study was 57 years, and 58.4% (1643/2814) were male. Over half of patients with a household size of 1-3 persons had a history of smoking. Of the included patients, 1132 (40.2%) reported having undergone prior surgery, 1577 (56.0%) had undergone prior chemotherapy, and 440 (15.6%) had undergone radiotherapy. At the time of investigation, 1889 (67.1%) patients were on anti-cancer therapy (including surgery, chemotherapy, radiotherapy, immunotherapy, and others). The median scores of the MDASI, HADS-A, HADS-D, and life satisfaction were 19, 5, 5, and 6, respectively.
[...] coefficients, the HADS-A score had the largest effect on the risk of SI.

| Performance of the prediction model
The AUC of this model was 0.85 (95% CI: 0.82-0.87) (Figure 2). When applied to subsets defined by study center and cancer type, the predictive model continued to perform well, with AUCs ranging from 0.75 to 0.95 across 10 hospitals and higher than 0.83 for all cancer types (Figure 3). Calibration plots showed good agreement between the model-predicted and observed probabilities of SI in the overall patient group and in patients with different cancers (Figure S3).

| Application of the prediction model
The application performance of our model under different workloads of psychological supportive care was assessed using the hypothetical tailored screening strategy in which "high-risk" patients were selected for referral. A similar trend was observed across all patients and the six types of cancer: the detection probability increased greatly as the cutoff value was raised (Figure 4; Table S2), which demonstrated the capability of the model to identify and enrich for patients at high risk of SI. For instance, in the scenario where only patients in the top 10% risk level would be referred for psychological supportive care, the detection probability would be over three-fold higher than with universal screening in any dataset. When we expected to cover more cases, such as at least 80% of all cases (i.e., sensitivity above 80%), only the top 40% of patients in the overall dataset needed to be referred (2.1-fold increase in detection probability), and similar results were found in patients with different cancers.
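The link between the referred fraction, sensitivity, and the fold increase in detection probability can be written out explicitly. The derivation below is a sketch that assumes the detection probability is the proportion of referred patients who have SI and that the fold increase is taken relative to the overall prevalence; the symbols $N$, $p$, $c$, $\mathrm{Se}$, $D$, and $R$ are introduced here for illustration and do not appear in the original analysis.

```latex
% N: number of patients, p: prevalence of SI,
% c: fraction of patients referred, Se: sensitivity at that cutoff.
\begin{align*}
  D &= \frac{\text{referred patients with SI}}{\text{referred patients}}
     = \frac{\mathrm{Se}\, p\, N}{c\, N}
     = \frac{\mathrm{Se}\, p}{c}, \\
  R &= \frac{D}{p} = \frac{\mathrm{Se}}{c}.
\end{align*}
% Check against the reported figures: with c = 0.40 and R = 2.1,
% Se = R c = 0.84, which agrees with "sensitivity above 80%".
% With c = 0.10 and R > 3, Se > 0.30.
```

Under these assumptions, the fold increase depends only on sensitivity and the referred fraction, which is why raising the cutoff (a smaller referred fraction) raises the detection probability as long as sensitivity falls more slowly than the referred fraction.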

| Sensitivity analysis
Sensitivity analysis was conducted in the complete dataset, in which 2303 (81.8%) patients with available data were included. The discrimination of the prediction model was consistent with the main analysis (AUC = 0.86, 95% CI: 0.84-0.88) (Figure 2). The model's structure and the corresponding AUC in the k-nearest neighbor imputed datasets were comparable (Table S3).

| DISCUSSION
Suicide is a significant problem among cancer patients, and preventing it is a challenging task in clinical oncology. In accordance with the ideation-to-action framework of suicide, early identification and addressing of SI are crucial to prevent the fatal outcome of suicide. Based on this, we examined nearly 3000 patients with advanced stages of six common cancers from 10 cancer centers in China, and for the first time, we built an easy-to-use model predicting the risk of SI among individuals diagnosed with advanced cancer. This model was established using cross-sectional data and is not intended as a prognostic model, but rather as a diagnostic tool to identify high-risk individuals and facilitate timely psychological intervention. By applying this model to clinical practice, we hope to provide a warning of SI and mitigate the risk of suicide among cancer patients.

FIGURE 1 The nomogram for predicting the risk of suicidal ideation in patients with advanced malignant tumors. To calculate the risk of suicidal ideation based on the patient characteristics, first determine the points for each predictor by drawing a vertical line from that predictor to the top points scale. Then sum all of the points and draw a vertical line from the total points scale to the risk line to obtain the risk of suicidal ideation. HADS, Hospital Anxiety and Depression Scale; MDASI, MD Anderson Symptom Inventory.

FIGURE 2 Receiver operating characteristic (ROC) curves of the risk prediction model for suicidal ideation in the multiple imputed dataset (red) and the complete dataset (blue) among patients with advanced malignant tumors.
Our model indicated that older and female patients had a higher risk of SI compared with younger and male patients. These results contradict the findings of a US study 19 in which greater suicide risk was associated with male sex and the 60-69 age group. One reason for the discrepancy may be the heterogeneity of the subjects in the two studies. In Chinese society, older and female patients may be more vulnerable. 20 With advanced cancer, this population may receive less social support and need more attention. In addition, data from the current study indicate that the risk of SI was lower in patients with fewer household members. This result is consistent with the research conducted by Zhou et al., which suggests that family adaptability and cohesion serve as protective factors against SI in Chinese cancer patients. 21 These findings emphasize that adequate social and family support is very important for patients facing advanced cancer. Facilitating social and family support should be one of the interventions used to prevent SI in this population.
The history of anti-cancer treatments, such as chemotherapy and surgery, was retained in the final model, in which a history of chemotherapy was associated with an elevated risk of SI. This agrees with the findings of previous studies that the side effects of chemotherapy have a substantial physical and psychological impact on patients. 26,27 However, the current study indicated that a history of cancer surgery was a protective factor against SI. This might be due to better expectations of disease prognosis among patients who underwent radical surgery. This finding provides a new perspective on the correlation between SI and cancer treatment. Areas for future research include the psychological impact of surgery in the postoperative period among cancer patients. Consistent with previous studies, physical symptoms and psychological distress predicted an increased likelihood of SI among patients with advanced cancer. 28,29 Previous research suggested that quality of life could act as a mediator in the association between symptom burden and SI, 4 and thus active symptom control (for example, analgesia) to improve overall quality of life should be essential to reduce SI. In the nomogram, anxiety and depression contributed the most to the model, which underlines the important role of anxiety and depression in the formation of SI. Therefore, it is imperative to enhance the screening, evaluation, and management of cancer-related symptoms and psychological distress in oncology clinical practice. In most cancer patients, symptoms such as pain, distress, or fatigue usually present as a cluster, 30 and a single symptom cannot represent the overall symptom burden. In this study, we used the total MDASI score as a variable to predict SI, as it mainly represents the overall symptom burden of patients.
In the present study, our model showed favorable accuracy (AUC = 0.85) in the overall dataset, and the desired robustness and homogeneity were found in the 16 sub-cohorts comprising the 10 study centers and six cancer types. Calibration plots for the risk of SI in the overall dataset and in each cancer dataset were all close to the 45-degree line. Moreover, this model yielded consistent risk enrichment trends in patients from different regions and with different cancer types, which suggests good adaptability and generalizability of the model when applied to different cancer patients and regions.
The model established in this study could serve as a valuable resource for oncology clinicians to identify advanced cancer patients at high risk of SI, facilitating more precise referrals. For example, if the cutoff is set at 0.709, 17.9% of advanced cancer patients would be classified into the high-risk subgroup for SI, which represents a 3.6-fold increase over the average for the entire sample. When the model is applied in real-world scenarios, different cutoff values should be adopted according to each institution's psychosocial service capacity.
In current clinical practice, suicide prevention is mostly reactive: intervention is provided only when suicidal behavior occurs. Implementing this predictive model to identify patients at heightened risk of SI and to intervene in a timely manner could change this reactive approach into a proactive one. It could also facilitate the integration of psychosocial care into standard cancer care, so that patients' psychosocial distress is identified actively rather than by waiting passively for patients to report it, and so that more patients can access the psychosocial care they need.
We acknowledge that information bias may have been introduced into the data collection, as SI was assessed using the suicide item within the PHQ-9 scale rather than the Columbia-Suicide Severity Rating Scale Screener (C-SSRS), which is widely regarded as the gold standard for evaluating SI. 31 However, in traditional Chinese culture, the investigation of SI may cause discomfort or even resistance among participants due to the sensitivity and taboo surrounding death. Therefore, embedding the suicide item among the other items of a broader scale may reduce respondents' psychological burden and defensiveness and allow more truthful information to be collected. Moreover, item 9 of the PHQ-9 is frequently utilized in screening for SI among cancer patients. 32,33 Although the C-SSRS is the standard for assessing suicide risk, it also has some limitations, such as lower sensitivity to suicide risk. 34
There are three limitations in this study. First, although 10 cancer centers were included in this multicenter study, the sample size of each center is still relatively limited, and further validation of our model is needed in a larger sample. Second, there is a lack of full life-cycle follow-up, so the ability of this model to predict actual suicidal behavior is still uncertain. Third, some potential variables, such as urban/rural residence status and financial burden, were not collected in this study. The sites involved in this multicenter study are all provincial-level cancer centers. Patients seeking treatment at these centers generally have economic means beyond the basic threshold, and currently, medical insurance covers both urban and rural areas. Financial burden is a sensitive variable that is not easily accessible. These variables should be considered in future studies.
In summary, we built an easy-to-use, well-performing prediction model for SI among patients with common advanced cancers based on multicenter real-world data and also proposed criteria for SI risk levels. A user-friendly online prediction tool (https://riskofsi.shinyapps.io/rpublish/) has been developed, featuring an interactive interface for parameter input and graphical display of results (Figure S4). This work provides a useful tool for future multidisciplinary care in clinical oncology, to improve cancer patients' quality of life and reduce the risk of suicidal behavior.

FIGURE 3 The areas under the curve (AUCs) of the risk prediction model in the internal validation set within each hospital and cancer site.
FIGURE 4 Detection rate of assumptive tailored screening with different cutoffs when the established prediction model was applied in all patients and in patients with different cancer sites.
TABLE 1 Abbreviations: HADS, Hospital Anxiety and Depression Scale; IQR, interquartile range; ISI, Insomnia Severity Index; MDASI, MD Anderson Symptom Inventory.
TABLE 2 Structure of the prediction model for predicting suicidal ideation in 2814 patients with advanced malignant tumors from 10 hospitals in China. Abbreviations: HADS, Hospital Anxiety and Depression Scale; IQR, interquartile range; MDASI, MD Anderson Symptom Inventory. (a) A two-phase selection based on a logistic regression model and backward elimination under the Akaike Information Criterion (AIC) was used to determine the final predictor panel. Only variables included in the final prediction model are shown in this table. (b) The 3rd imputed dataset was used to calculate the proportion and median value of predictors.